You Can't Beat Frequency (Unless You Use Linguistic Knowledge) - A Qualitative Evaluation of Association Measures for Collocation and Term Extraction
Abstract
In recent years, a number of lexical association measures have been studied as aids for extracting new scientific terminology or general-language collocations. The implicit assumption of this research was that newly designed term measures involving more sophisticated statistical criteria would outperform simple counts of co-occurrence frequencies. Here we test this assumption explicitly. Using four qualitative criteria, we show that purely statistics-based measures reveal virtually no difference from frequency-of-occurrence counts, whereas linguistically more informed metrics do reveal a marked difference.
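As a rough illustration of the kind of comparison the abstract describes, the following sketch ranks adjacent word pairs in a toy corpus by raw co-occurrence frequency and by pointwise mutual information (PMI). PMI is used here only as a familiar example of a statistics-based association measure; the abstract does not name the measures that were evaluated, and the function names and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def bigram_scores(tokens):
    """Return {bigram: (frequency, pmi)} for adjacent word pairs.

    Assumption: PMI stands in for "a statistics-based association measure";
    it is not taken from the paper.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), freq in bigrams.items():
        p_xy = freq / n_bi            # observed probability of the pair
        p_x = unigrams[w1] / n_uni    # probability of the first word
        p_y = unigrams[w2] / n_uni    # probability of the second word
        scores[(w1, w2)] = (freq, math.log2(p_xy / (p_x * p_y)))
    return scores


def rank(scores, index):
    """Sort bigrams by frequency (index 0) or PMI (index 1), highest first."""
    return sorted(scores, key=lambda bg: scores[bg][index], reverse=True)


# Hypothetical toy corpus, for illustration only.
tokens = "the cell line the cell line the cell growth rare event".split()
scores = bigram_scores(tokens)
by_freq = rank(scores, 0)
by_pmi = rank(scores, 1)

for bg in by_freq:
    freq, pmi = scores[bg]
    print(f"{' '.join(bg):<14} freq={freq}  pmi={pmi:.2f}")
```

On a sample this small, PMI tends to favor pairs of words that each occur only once, so the output says nothing about the paper's findings; it only shows how the two rankings are computed and compared.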
Similar resources
Presenting a Hybrid Approach based on Two-stage Data Envelopment Analysis to Evaluating Organization Productivity
Measuring the performance of a production system has been an important task in management for purposes of control, planning, etc. Lord Kelvin said: “When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind.” Hence, manag...
The Domain of the semantics of ‘promise’ in the Holy Quran
Semantics is the branch of linguistics that analyzes the meaning of the words and sentences of a text and identifies parts of speech with regard to meaning. This descriptive-analytic study examines the meaning of ‘promise’ in the Holy Quran, based on principles of semantics with a collocation approach and using library research methods. Also, by virtue of ...
An Exploratory Study on the Use of 'I Love You' in the American Context
This study explores the use of the English locution I love you in the American context. The data were collected through a focus discussion group and a survey questionnaire. 120 college undergraduate students from a large public American university participated in the study with 28 attending the focus discussion group and 92 completing the survey questionnaire. The findings indicated th...
TCtract-A Collocation Extraction Approach for Noun Phrases Using Shallow Parsing Rules and Statistic Models
This paper presents a hybrid method for extracting Chinese noun phrase collocations that combines a statistical model with rule-based linguistic knowledge. The algorithm first extracts all the noun phrase collocations from a shallow parsed corpus by using syntactic knowledge in the form of phrase rules. It then removes pseudo collocations by using a set of statistic-based association measures (...
Semiotics of a Ghazal by Mowlana
Mowlanā Jalāl al-Din Muhammad Rūmi (604-672 AH) is one of the Iranian poets and Sufis. For those who do not know Mowlavi well, understanding the concepts in his ghazals is a demanding task. The mystical expressions, his broad knowledge of various Islamic sciences, and cultural and literary traditions can be regarded as causes of this. In spite of the growing trend of researching o...
Publication date: 2006